ABSTRACT 

The Statistical Reporting Service of the U.S. Department of 
Agriculture, as a principal investigator for NASA, is evaluating 
ERTS-1 imagery as a potential tool for estimating crop acreage. 

The Statistical Reporting Service makes crop and livestock fore- 
casts and estimates throughout the year, across the U.S. A 
main data source for the estimates is obtained by enumerating 
small land parcels that have been randomly selected from the 
total U.S. land area. These small parcels are being used as 
ground observations in this investigation. The test sites 
are located in Missouri, Kansas, Idaho, and South Dakota. The 
major crops of interest are wheat, cotton, corn, soybeans, sugar 
beets, potatoes, oats, alfalfa, and grain sorghum. Some of the 
crops are unique to a given site while others are common in two 
or three States. This provides an opportunity to observe crops 
grown under different conditions. Results for the Missouri test 
site are presented in this report. Results of temporal overlays, 
unequal prior probabilities, and sample classifiers are discussed. 
The amount of improvement that each technique contributes is 
shown in terms of overall performance. The results show that 
useful information for making crop acreage estimates can be 
obtained from ERTS-1 data. 

INTRODUCTION 

The Statistical Reporting Service (SRS) of the U.S. Department of Agriculture 
is the main fact-gathering agency of the USDA. The name of the agency has 
changed several times, 
but the objective of collecting and disseminating primary data on agricul- 
ture has remained the same for more than 100 years. Crop acreage and pro- 
duction as well as livestock, prices, labor, and farm expenditures are 
estimated. 

Many of these estimates are generated from a general purpose land 
area sample survey conducted in June and based on 17,000 segments selected 
at random from the total U.S. land area. This is a sample stratified by 
States and, within States, by land use. Segments for a State are defined 
within each category of land use, or stratum, and a sample of these segments 
is selected. Stratification by land use has made it possible to sample 
more efficiently for all items because sample segments are allocated to 
each stratum individually. At the time of field enumeration, the 
interviewer must be able to identify the boundaries of the sample segment and 
collect information which applies to the land inside these boundaries. 

ERTS imagery may also be helpful in stratification and in the segment 
selection process; we have not used ERTS for these purposes yet, but plan 
to try this soon. 

Keep in mind that we use these segments to generate livestock and 
price estimates as well as crop acreages, and for this reason, ERTS will 
not replace our present system for major items. Secondly, our estimates 
have sampling errors between 2 percent and 5 percent at the U.S. level, 
and between 5 and 10 percent at the State level for major commodities. 
We do not go much below the State level for our probability survey since 
the sample was not designed to provide estimates below the State level. 

PROCEDURES 

Twenty-nine segments, each approximately one square mile in size, were 
located in two ERTS frames covering most of Crop Reporting District No. 9 
in Southeast Missouri. The segments are spread over a 10,000 square 
mile area. Information on the crop and acreage of each field was obtained 
by SRS enumerators during the summer of 1972; this data has been used for 
training the classifier and testing its performance. ERTS data from three 
dates was included in the analysis. Data collected September 14 and 
October 2, 1972 was registered (overlaid) to data collected August 26, 1972. 
The temporal overlay removed the need to locate fields in three 
different data sets, and also permitted a test of the utility of temporal 
data in the classification. 

The ERTS data was also geometrically corrected to facilitate locating 
the coordinates of segments and fields. In the geometric correction 
process the MSS data is rotated, deskewed, and scaled to 1:24,000. 
The geometrically corrected data was overlaid on 1:24,000 scale topographic 
maps on which the segments had been outlined. The individual segments 
were then classified (clustered) using the nonsupervised classifier in 
LARSYS. Field coordinates were located on the map output from this 
classification. Final classifications were carried out using the supervised 
classifier in LARSYS.1/ 


RESULTS 

The results are presented in the form of a classification matrix. 
Table 1 shows the classification results obtained when using quadratic 
discriminant functions with equal prior probabilities. That is, it is 
assumed that the probability of occurrence of corn is the same as the 
probability for cotton, and so forth. Because of the small size of the data 
set, the whole data set was used in training the classifier. This is a 
nine-channel classification with data from three ERTS passes. The four 
major classes, cotton, corn, soybeans, and grass, were classified 74, 59, 
40, and 57 percent correctly, respectively. Overall performance was 59 
percent. 
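
The percentages quoted above follow directly from the counts in Table 1. The short Python sketch below recomputes them; the function and variable names are illustrative only and are not part of LARSYS.

```python
# Rows are the true class, columns the class assigned by the classifier,
# in the order Cotton, Corn, Soybeans, Grass, Misc. Counts copied from Table 1.
CLASSES = ["Cotton", "Corn", "Soybeans", "Grass", "Misc."]
TABLE_1 = [
    [689, 21, 83, 36, 98],
    [4, 34, 3, 10, 7],
    [101, 49, 338, 137, 227],
    [34, 22, 22, 137, 25],
    [14, 5, 7, 9, 105],
]


def percent_correct_by_class(matrix):
    """Diagonal count divided by the row total, as a percentage."""
    return [100.0 * row[i] / sum(row) for i, row in enumerate(matrix)]


def overall_performance(matrix):
    """All correctly classified points divided by all points, as a percentage."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return 100.0 * correct / total


for name, pct in zip(CLASSES, percent_correct_by_class(TABLE_1)):
    print(f"{name:9s}{pct:6.1f}")
print(f"{'Overall':9s}{overall_performance(TABLE_1):6.1f}")
```

The column sums of the same matrix give the number of points assigned to each class, which is the quantity compared with the actual counts in the discussion of prior probabilities.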

The assumption of equal prior probabilities is often not 
valid, but it is frequently used because of lack of information. 
The prior probabilities used in this study came from an earlier 
survey, the June 1972 Enumerative Survey. Other sources of 
prior probability information are historic data, for example, 
last year's farm census. Classification results using unequal 
prior probabilities are shown in Table 2. Comparing the results 
in Table 1 to those in Table 2, it is seen, first, that the overall 
performance has been increased from 59 to 71 percent and, secondly, 
that the total number of points classified into each class is 
much closer to the actual number of points present. For example, 
from Table 2, the total number of points classified as cotton 
is 906, which is considerably closer to 927, the actual number 
present, than the 842 obtained in Table 1. The total number of 
corn points, 43, is rather close to the actual 58 present. For 
soybeans, the total of 866 is very close to the actual 852 present. 
Two hundred seventy-seven (277) points were classified as grass 
compared to 240 actual points of this crop. Further, the statistical 
properties of estimates made on this basis are better since, if the 
assumption of normality for the data set is correct and the prior 
probabilities are correct, we obtain unbiased estimates. 
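
How a prior probability enters a quadratic discriminant function can be shown with a minimal Python sketch. It assumes normally distributed classes and scores each class by the log of its prior plus the log of its normal density; this is a generic textbook form, not the LARSYS code, and the two-channel class statistics below are hypothetical values chosen only to show a decision changing.

```python
import numpy as np


def fit_class_statistics(samples_by_class):
    """Estimate a mean vector and covariance matrix per class from training points."""
    stats = {}
    for name, points in samples_by_class.items():
        points = np.asarray(points, dtype=float)
        stats[name] = (points.mean(axis=0), np.cov(points, rowvar=False))
    return stats


def quadratic_score(x, mean, cov, prior):
    """log(prior) + log normal density, without the constant shared by all classes."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    mahalanobis = diff @ np.linalg.solve(cov, diff)
    return np.log(prior) - 0.5 * logdet - 0.5 * mahalanobis


def classify_point(x, stats, priors):
    """Assign a point to the class with the largest quadratic score."""
    x = np.asarray(x, dtype=float)
    scores = {name: quadratic_score(x, mean, cov, priors[name])
              for name, (mean, cov) in stats.items()}
    return max(scores, key=scores.get)


# Hypothetical two-channel class statistics (not taken from the ERTS data).
STATS = {
    "cotton": (np.array([0.0, 0.0]), np.eye(2)),
    "corn": (np.array([2.0, 0.0]), np.eye(2)),
}
POINT = [1.2, 0.0]  # slightly nearer the corn mean
EQUAL = {"cotton": 0.5, "corn": 0.5}
UNEQUAL = {"cotton": 0.9, "corn": 0.1}  # cotton assumed far more common

print(classify_point(POINT, STATS, EQUAL))    # nearest class wins
print(classify_point(POINT, STATS, UNEQUAL))  # the prior pulls the point to cotton
```

With equal priors the point goes to the nearer class; with a large prior for cotton the same point is assigned to cotton, which is why the class totals move toward the expected proportions.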

Most classifications reported by other researchers have 
not used prior probabilities. While the overall error rate 
reported here is higher than reported by some researchers, this 
study was based on a statistical sampling of the entire land 
area in the study areas rather than on purposely selected test 
sites. 

Table 3 shows results of using a sample classifier rather 
than the point classifier used in the above work. In a point 
classifier system each point in a field can be assigned to any of the 
groups. With the sample classifier all points in the field are 
assigned to the same class or crop. One drawback to this 
procedure is that a large number of fields were not 
classified, because the technique requires p+1 data points in a 
field in order to form the statistics necessary to assign it to a crop 
(where p is the length of the vector of measurements). However, 
if enough points are present, classification performance has 
generally been found to be better than for the point classifier. 
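
The p+1 requirement can be illustrated with a minimal Python sketch. It is not the LARSYS sample classifier. It assumes, for illustration only, that a field is summarized by its own mean vector and covariance matrix and is assigned to the class whose training statistics are nearest in Bhattacharyya distance; the function names and the choice of distance are assumptions of this sketch. With fewer than p+1 points the sample covariance matrix of a field is singular, so such fields are returned as not classified.

```python
import numpy as np


def bhattacharyya_distance(mean_a, cov_a, mean_b, cov_b):
    """Bhattacharyya distance between two multivariate normal distributions."""
    cov_avg = 0.5 * (cov_a + cov_b)
    diff = mean_a - mean_b
    term_mean = 0.125 * (diff @ np.linalg.solve(cov_avg, diff))
    _, logdet_avg = np.linalg.slogdet(cov_avg)
    _, logdet_a = np.linalg.slogdet(cov_a)
    _, logdet_b = np.linalg.slogdet(cov_b)
    term_cov = 0.5 * (logdet_avg - 0.5 * (logdet_a + logdet_b))
    return term_mean + term_cov


def classify_field(points, class_stats):
    """Assign a whole field to one class, or None if it has too few points.

    points: array of shape (n_points, p) with p >= 2 channels.
    class_stats: dict mapping class name -> (mean vector, covariance matrix).
    """
    points = np.asarray(points, dtype=float)
    n_points, p = points.shape
    if n_points < p + 1:
        return None  # field statistics cannot be formed: "not classified"
    field_mean = points.mean(axis=0)
    field_cov = np.cov(points, rowvar=False)
    distances = {
        name: bhattacharyya_distance(field_mean, field_cov, mean, cov)
        for name, (mean, cov) in class_stats.items()
    }
    return min(distances, key=distances.get)


# Hypothetical two-channel class statistics and fields (not ERTS data).
CLASS_STATS = {
    "cotton": (np.array([0.0, 0.0]), np.eye(2)),
    "soybeans": (np.array([5.0, 5.0]), np.eye(2)),
}
LARGE_FIELD = [[0, 0], [1, 0], [0, 1], [1, 1]]  # 4 points, p = 2
SMALL_FIELD = [[0, 0], [1, 1]]                  # 2 points < p + 1

print(classify_field(LARGE_FIELD, CLASS_STATS))
print(classify_field(SMALL_FIELD, CLASS_STATS))
```

In the nine-channel case discussed in this report, the same check would reject any field with fewer than 10 points.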

In the work we have done in Missouri using the sample classi- 
fier, about 40 percent of the fields were not classified because 
the required number of points for the classifier (10 in this 
particular case) exceeded the number of points present within the 
defined fields. Of the total number of fields 33 percent were 
correctly identified. Considering only those fields which were 
classified, 54 percent were classified correctly. 

In Missouri 71 percent of the fields were less than 20 acres, 
but accounted for 32 percent of the total area. In our Kansas 
site, 20 percent of the fields were less than 20 acres, but 
accounted for only 1.5 percent of the total land area. In South 
Dakota, 40 percent of the fields were less than 20 acres, and 
accounted for 15 percent of the area. In Idaho, 74 percent of 
the fields were less than 20 acres, and accounted for 25 percent 
of the area. If 20 acres is a critical field size for the 
classifier, we would expect to do well in making acreage 
estimates in Kansas, but in Missouri only a little more than 50 
percent of the acreage would be accounted for. 

Next, the information gained from the temporal overlay is 
evaluated. In Table 4 classification results for single dates 
are compared to the multitemporal classifications already pre- 
sented. The overall classification performance was improved 
about 10 percent by the addition of temporal data with even 
greater improvement for several of the individual classes. 
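
The size of the gain can be read off the "Overall" row of Table 4. The following Python sketch, with illustrative names, computes the difference in percentage points between the multitemporal classification and each single date.

```python
# Overall percent correct, copied from the "Overall" row of Table 4.
OVERALL = {
    "Multitemporal": 70.8,
    "Aug. 26": 61.6,
    "Sept. 14": 61.1,
    "Oct. 2": 59.2,
}


def gains_over_single_dates(overall):
    """Percentage-point gain of the multitemporal result over each single date."""
    multi = overall["Multitemporal"]
    return {
        date: round(multi - value, 1)
        for date, value in overall.items()
        if date != "Multitemporal"
    }


for date, gain in gains_over_single_dates(OVERALL).items():
    print(f"{date:9s} +{gain:.1f} points")
```

The gains range from roughly 9 to 12 percentage points, depending on which single date is taken as the baseline.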

DISCUSSION 

The results presented do not show the classification 
accuracy to be as high as that found by other investigators. 
The lower performance level is attributed primarily to the 
greater variation in crops, soils, and weather over a 10,000 
square mile area than is found over smaller areas and, 
secondly, to the kind of crops which were being discriminated. Still, 
the classifications contain enough information to be useful in 
estimating crop acreages over large areas, particularly if 
regression or some other technique is used to improve the 
estimate. 


A regression estimator can be used to reduce the variance 
of the estimate. For example, if a large area is classified 
and there is an $r^2$ of .50 between the discriminant function 
classification and what the ground acreage data shows, we can 
adjust our area sample estimates by the complete classified 
data and obtain a reduced variance of $\sum y^2 (1 - r^2) / [n(n-2)]$, 
where $r^2$ is the correlation coefficient squared. The estimate of the 
variance of the comparable statistic without using ERTS data is 
$\sum y^2 / [n(n-1)]$, which would be nearly twice as large when $r^2 = .50$. 
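
The "nearly twice as large" statement can be checked by taking the ratio of the two variance expressions. Here $\sum y^2$ is read as the sum of squared deviations of the segment values, which is an interpretation, and n = 29 (the number of Missouri segments) is used only as an illustrative sample size.

```latex
\[
\frac{\sum y^2 (1 - r^2) / [n(n-2)]}{\sum y^2 / [n(n-1)]}
  = (1 - r^2)\,\frac{n-1}{n-2}
\]
\[
r^2 = 0.50,\ n = 29:\qquad
0.50 \times \frac{28}{27} = \frac{14}{27} \approx 0.52,
\qquad \frac{27}{14} \approx 1.93
\]
```

Under these assumptions the variance without ERTS data is about 1.9 times the regression-adjusted variance, and the ratio approaches 2 as n grows.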

If we were to classify a sample of points we would have a 
double sample and the variance would be: 

$$\frac{\sum y^2 (1 - r^2)}{n(n-2)} + \frac{\sum y^2 \, r^2}{n \, m}$$

where n = the sample size from JES and m = the sample size from 
ERTS. 
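
The three variance expressions in this section can be compared numerically with a short Python sketch. The input values are hypothetical. The form of the double-sample expression, in particular its second term, is a reconstruction assumed for this sketch, with n the number of ground (JES) segments and m the number of classified ERTS sample units.

```python
def variance_direct(sum_sq_dev, n):
    """Variance of the estimate from ground data alone: sum(y^2) / [n(n-1)]."""
    return sum_sq_dev / (n * (n - 1))


def variance_regression(sum_sq_dev, n, r_squared):
    """Regression estimator with the whole area classified:
    sum(y^2) (1 - r^2) / [n(n-2)]."""
    return sum_sq_dev * (1.0 - r_squared) / (n * (n - 2))


def variance_double_sample(sum_sq_dev, n, m, r_squared):
    """Double sample: regression variance plus a term for classifying only
    m units. The second term, sum(y^2) r^2 / (n m), is an assumed form."""
    return (variance_regression(sum_sq_dev, n, r_squared)
            + sum_sq_dev * r_squared / (n * m))


# Hypothetical inputs, for illustration only.
SUM_SQ_DEV = 1000.0   # sum of squared deviations of segment acreages
N_SEGMENTS = 29       # ground sample size
M_CLASSIFIED = 500    # classified ERTS sample size
R_SQUARED = 0.50

direct = variance_direct(SUM_SQ_DEV, N_SEGMENTS)
regression = variance_regression(SUM_SQ_DEV, N_SEGMENTS, R_SQUARED)
double = variance_double_sample(SUM_SQ_DEV, N_SEGMENTS, M_CLASSIFIED, R_SQUARED)
print(f"direct {direct:.3f}  regression {regression:.3f}  double sample {double:.3f}")
```

Under these assumptions the double-sample variance lies between the other two and approaches the full-classification value as m increases.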

Table 1. Classification matrix of quadratic discriminant functions 
with equal prior probabilities using data from three 
overflights,* Missouri Study Area. 

          No. of             Number of samples classified into
          sample   Percent
Group     points   correct   Cotton   Corn   Soybeans   Grass   Misc.

Cotton       927      74.3      689     21         83      36      98
Corn          58      58.6        4     34          3      10       7
Soybeans     852      39.7      101     49        338     137     227
Grass        240      57.1       34     22         22     137      25
Misc.        140      75.0       14      5          7       9     105

Totals      2217                842    131        453     329     462

Overall performance 58.8 percent 

* August 26, 1972, MSS bands 4, 5, 7. 
  September 14, 1972, MSS bands 5, 7. 
  October 2, 1972, MSS bands 4, 5, 6, 7. 


Table 2. Classification matrix of quadratic discriminant functions 
with unequal prior probabilities using data from three 
overflights,* Missouri Study Area. 

          No. of             Number of samples classified into
          sample   Percent
Group     points   correct   Cotton   Corn   Soybeans   Grass   Misc.

Cotton       927      79.7      739      2        137      26      23
Corn          58      44.8        9     26          7      14       1
Soybeans     852      71.8       99     12        612      96      23
Grass        240      53.3       42      1         66     128       2
Misc.        140      89.3       17      2         44      13      64

Totals      2217                906     43        866     277     125

Overall performance 70.8 percent 

* August 26, 1972, MSS bands 4, 5, 7. 
  September 14, 1972, MSS bands 5, 7. 
  October 2, 1972, MSS bands 4, 5, 6, 7. 


Table 3. Sample classification matrix based on data from 3 overflights, 
Missouri Study Area. 

                  Percent          Percent
          No. of   fields  No. of   points                                               Not
Group     fields  correct  points  correct  Cotton  Corn  Soybeans  Grass  Misc.  classified

Cotton        38     63.2     927     85.0      24     0         2      0      1          11
Corn           7     14.3      58     20.7       0     1         0      1      1           4
Soybeans      58     25.9     852     44.2       9     3        15      3      8          20
Grass         31      9.7     240     29.6       3     1         1      3      2          21
Misc.          9     44.4     140     65.7       1     0         1      1      4           2

Totals       143     32.9    2217     60.4      37     5        19      8     16          58

Table 4. Comparison of multitemporal classification performance to 
classifications of single dates, Missouri Study Area. 

Group       Multitemporal   Aug. 26   Sept. 14   Oct. 2

Cotton               79.7      60.6       69.7     73.2
Corn                 44.8      10.3        0.0     12.1
Soybeans             71.8      86.0       67.6     62.4
Grass                53.3       8.3       42.1     27.9
Misc.                89.3      31.4       22.8     17.9

Overall              70.8      61.6       61.1     59.2

Unequal prior probabilities were used for all classifications. 

August 26, 1972, MSS bands 4, 5, 7. 
September 14, 1972, MSS bands 5, 7. 
October 2, 1972, MSS bands 4, 5, 6, 7. 